# install packages if necessary
# install.packages(c("censusapi", "tigris"))
# activate packages
library(censusapi) # retrieving census attribute data
library(tigris) # retrieving census geometries
library(sf) # manipulating geometry data
library(dplyr) # data wrangling, pipe operator
library(tidyverse) # data wrangling
library(biscale) # bivariate mapping
library(corrplot) # correlation plot
library(scales) # used for rescaling data
options(scipen = 999, digits = 2) # format output for data tables
GEO 3/446 Exercise #1
Bivariate Choropleth Mapping in R using CDC PLACES and SVI Data
Purpose
This exercise uses Centers for Disease Control and Prevention (CDC) PLACES and Social Vulnerability Index (SVI) data in RStudio and ArcGIS Pro to create a series of bivariate choropleth maps, along with related figures and tables, that explore selected public health and sociodemographic relationships.
Students will save maps and summaries of spatial analyses in a PowerPoint presentation (or, optionally, a document created with Quarto markdown in R) to be submitted via the course web page. Students are welcome to work collectively with classmates to overcome obstacles experienced while completing the required tasks, although all submissions will be unique given that students will choose their own custom study area and variables for mapping.
The sections below step you through the process to create the above deliverable. For some of the more complicated procedures, online videos will be made available on the course D2L under the Exercise #1 content folder.
Step 1. Create new R project, folders, and data processing script
Effective file management is a foundational aspect of geographic information systems (GIS) that contributes to data organization, integrity, collaboration, efficiency, security, scalability, documentation, compliance, version control, and data accessibility. Ignoring file management can lead to data loss, errors, and inefficiencies in GIS projects, whereas conscientious file management enhances the overall efficiency and reliability of GIS workflows.
In RStudio, create a project in a new directory (e.g., GEO336/exercise_01) within your general course folder. Here you will save all files related to exercise #1. Create the following folders within the new exercise-specific directory: “scripts” (for storing R code), “layers” (for storing geographic feature layers), and “maps” (for storing ArcGIS Pro projects).
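The three folders can also be created from the R console rather than through the Files pane. The following is a minimal sketch, assuming your working directory is the new exercise project folder; the helper name `create_exercise_folders` is hypothetical and not part of any package.

```r
# Hypothetical helper: create the exercise sub-folders under a project root
create_exercise_folders <- function(root = ".") {
  folders <- file.path(root, c("scripts", "layers", "maps"))
  for (folder in folders) {
    # showWarnings = FALSE keeps the call quiet if the folder already exists
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }
  # report which folders now exist
  dir.exists(folders)
}

# Assumes the working directory is the exercise project folder
create_exercise_folders(".")
```

Opening the RStudio project first sets the working directory to the project folder, so the relative path "." points to the right place.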
Create data processing script in R
In RStudio, create a new script file for storing data processing code relevant to this exercise. Create the blank script file by first navigating to the “scripts” directory you created in your project working directory and clicking “New Blank File” in the files pane. Name the script something intuitive (e.g., “exercise01.R”).
Activate relevant packages
Next, activate the R packages needed for this exercise using the library() calls listed at the top of this handout. If the packages are not yet installed, you can install multiple packages at once by passing a vector of package names to the “install.packages” function, for example:
install.packages(c("censusapi", "tigris"))
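To avoid reinstalling packages you already have, you can first check which ones are missing. This is a sketch only: the helper name `find_missing_packages` is hypothetical, and the package list is taken from the library() calls in this handout.

```r
# Hypothetical helper: return the package names that are not yet installed
find_missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Packages activated in this exercise
needed <- c("censusapi", "tigris", "sf", "dplyr", "tidyverse",
            "biscale", "corrplot", "scales")

missing_pkgs <- find_missing_packages(needed)
missing_pkgs # character(0) means everything is already installed

# Uncomment to install only the missing packages
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```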
Step 2. Select a study area and download boundary files
For this exercise, you will select a large (i.e., with a population over 500,000), urban county in the contiguous United States as your custom study area. Browse the interactive map of US counties below to identify a study area that interests you. Make note of the county’s name and unique “geoid”. The “geoid” combines the two-digit state code with a three-digit county code (e.g., 17031 is the geoid for Cook County, Illinois: state code 17 plus county code 031).
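The structure of a county geoid can be illustrated with a few lines of base R. This sketch uses the Cook County example from the text; the variable names are illustrative only. Note that the codes are stored as text so that leading zeros are preserved.

```r
# State and county codes stored as character strings to keep leading zeros
state_code <- "17"   # Illinois
county_code <- "031" # Cook County

# Combine the two codes into the five-digit county geoid
geoid <- paste0(state_code, county_code)
geoid

# Split a geoid back into its state and county parts
substr(geoid, 1, 2) # state code
substr(geoid, 3, 5) # county code

# If codes are stored as numbers, pad them with zeros when combining
padded_geoid <- sprintf("%02d%03d", 6L, 37L)
padded_geoid
```

Treating the geoid as a number would drop the leading zero for states with codes below 10, so keep it as a character value when filtering or joining data.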
# Make interactive map of US counties by population using the leaflet package
# NOTE: assumes us_counties_population_geo, an sf object with NAME, STATE_NAME,
# GEOID, total_pop, and total_pop_q columns, has already been created
library(leaflet) # interactive maps (colorBin, addPolygons, addLegend)
library(leaflet.extras) # addFullscreenControl
# assign bins for population thematic map loosely based on quintile breaks
counties_bins <- c(0, 25000, 100000, 500000, 999999, Inf)
# create palette using colorblind/accessible colors
counties_pal <- colorBin(
  palette = "viridis",
  domain = us_counties_population_geo$total_pop,
  na.color = "transparent",
  bins = counties_bins,
  reverse = TRUE
)
# format/generate popup labels for map
labels <- sprintf(
  paste0(
    "<strong>%s, %s</strong><br/>",
    "Population (2021): %s<br/>",
    "GEOID: %s<br/>",
    "Quintile group: %s"
  ),
  us_counties_population_geo$NAME,
  us_counties_population_geo$STATE_NAME,
  format(us_counties_population_geo$total_pop, big.mark = ",", trim = TRUE),
  us_counties_population_geo$GEOID,
  us_counties_population_geo$total_pop_q
) %>%
  lapply(htmltools::HTML)
# create thematic map using leaflet
leaflet() %>%
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(
    group = "US counties",
    # leaflet expects geographic coordinates (WGS 84, EPSG:4326)
    data = us_counties_population_geo %>% st_transform(crs = 4326),
    fillColor = ~counties_pal(total_pop),
    weight = 0.5,
    opacity = 0.5,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7,
    # emphasize the county under the cursor
    highlightOptions = highlightOptions(
      weight = 5,
      color = "#666",
      dashArray = "",
      fillOpacity = 1,
      bringToFront = FALSE
    ),
    label = labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto"
    )
  ) %>%
  addLegend(
    "topright",
    pal = counties_pal,
    values = us_counties_population_geo$total_pop,
    title = "Population (2021)",
    opacity = 1
  ) %>%
  addScaleBar(position = "bottomleft") %>%
  addFullscreenControl()
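To see how the population class breaks used in the map code sort counties into groups, and which counties meet the size requirement for a study area, the same breaks can be applied with base R's cut() function. This is a sketch under stated assumptions: the county names and populations below are hypothetical values made up for illustration (not census figures), and the classes are treated as closed on the left, which is assumed to match the default behavior of colorBin.

```r
# Same class breaks as the leaflet map code
counties_bins <- c(0, 25000, 100000, 500000, 999999, Inf)

# Hypothetical populations for illustration only (not real census values)
example_counties <- data.frame(
  name = c("County A", "County B", "County C", "County D"),
  total_pop = c(10000, 60000, 750000, 5200000)
)

# right = FALSE: each class includes its lower break and excludes its upper break
example_counties$pop_class <- cut(
  example_counties$total_pop,
  breaks = counties_bins,
  right = FALSE,
  dig.lab = 7
)

# Flag counties that meet the study area requirement (population over 500,000)
example_counties$eligible <- example_counties$total_pop > 500000

example_counties
```

Only counties in the two highest classes can qualify as a study area, which is why those classes are worth looking for on the interactive map.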